
Pennsylvania State University Researchers Leverage CIROH Cyberinfrastructure for Advanced Hydrological Modeling

· 3 min read
Arpita Patel
DevOps Manager and Enterprise Architect
Yalan Song
Research Assistant Professor
Tadd Bindas
Graduate Researcher

Pennsylvania State University (PSU) researchers have been leveraging CIROH Cyberinfrastructure to tackle complex hydrological modeling challenges. This post highlights their innovative approach using the Wukong computing platform in conjunction with Amazon S3 bucket storage to efficiently process and analyze large-scale environmental datasets. 🚀


💻 The Computing and Storage Infrastructure

Wukong Computing Platform

The PSU team has been using Wukong, a high-performance computing (HPC) cluster designed for data-intensive scientific applications such as high-resolution, physics-informed machine learning for national water modeling (Song et al., 2024). Wukong provides the computational power needed to run complex simulations and process large environmental datasets that traditional computing resources would struggle with.

Key advantages of Wukong include:

  • 🎯 Large GPU capacity for high-resolution ML/differentiable process-based models
  • ⚙️ Scalable parallel processing capabilities
  • 🚀 Optimized performance for data-intensive workloads
  • ⏳ Reduced processing time for big data
  • 🌐 Support for multi-node computation to handle larger geographical areas

S3 Bucket Integration

To complement Wukong's computational power, the PSU researchers and AWI DevOps staff implemented Amazon S3 (Simple Storage Service) buckets as their secondary data storage solution. This integration offers several benefits:

  • 🗄️ Virtually unlimited storage capacity for growing datasets
  • 🔒 Data durability and redundancy
  • 💰 Cost-effective long-term storage, using S3 Intelligent-Tiering to automate cost savings by moving data between access tiers as access patterns change
  • 🔄 Seamless data transfer between computing nodes
  • Version control for dataset iterations
  • 🤝 Easy data sharing with users not on Wukong
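As a rough illustration of the Intelligent-Tiering point above, here is a minimal sketch of a lifecycle rule that moves objects under a prefix into the `INTELLIGENT_TIERING` storage class. It is not the configuration used on Wukong: the `forcings/` prefix, the rule ID, and the helper names `intelligent_tiering_rule` and `apply_rule` are all hypothetical. Only `apply_rule` needs `boto3` and AWS credentials; the rest runs with the standard library.

```python
import json


def intelligent_tiering_rule(prefix: str, rule_id: str = "move-to-intelligent-tiering") -> dict:
    """Build an S3 lifecycle configuration that transitions objects under
    `prefix` to the INTELLIGENT_TIERING storage class."""
    return {
        "Rules": [
            {
                "ID": rule_id,  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Days=0 requests the transition as soon as S3 allows after upload.
                "Transitions": [{"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}],
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket with boto3 (needs AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)


if __name__ == "__main__":
    # "forcings/" is an assumed prefix, not one taken from the PSU setup.
    print(json.dumps(intelligent_tiering_rule("forcings/"), indent=2))
```

Setting the storage class through a bucket-level rule, rather than on each upload, means existing scripts that write to the bucket would not need to change.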

🔬 Research Applications

The PSU team has applied this powerful computing infrastructure to several critical research areas:

  1. National Streamflow Modeling 🌊
    Training differentiable hydrologic models on high-resolution forcing and static attribute data across large geographical regions, using observations from thousands of stream gauges, then running the trained models forward over the whole domain.

  2. National River Routing 🗺️
    Conducting river routing on MERIT/HydroFabric river networks, combined with neural network-supported routing parameter learning.

  3. NextGen Candidate Models & Data Assimilation 🔄
    Applying multiple NextGen candidate models and data assimilation algorithms within the differentiable modeling framework, which supports compliance with the Basic Model Interface (BMI).

  4. Foundation Model Development 🏞️
    Developing a foundation model to explore co-evolution between landscapes.
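To give a feel for what "differentiable" means in the items above, here is a toy sketch, not the PSU team's code. It assumes a one-parameter linear reservoir (outflow is a fraction `k` of storage) and recovers `k` from a synthetic flow record by gradient descent, with the derivative propagated by hand. The real models are far larger and would typically rely on an automatic differentiation framework and neural networks; the function names, rainfall series, and learning rate here are made up for illustration.

```python
def simulate(k, precip):
    """Run a one-parameter linear reservoir.

    Returns the simulated flows and d(flow)/dk, propagated step by step
    (forward-mode differentiation written out by hand).
    """
    storage, d_storage = 0.0, 0.0
    flows, d_flows = [], []
    for p in precip:
        storage += p                  # rain enters the bucket
        q = k * storage               # outflow is a fraction k of storage
        dq = storage + k * d_storage  # product rule for d(k * storage)/dk
        storage, d_storage = storage - q, d_storage - dq
        flows.append(q)
        d_flows.append(dq)
    return flows, d_flows


def fit_k(precip, observed, k=0.6, lr=0.01, steps=500):
    """Gradient descent on the mean squared error between simulated and observed flow."""
    n = len(observed)
    for _ in range(steps):
        flows, d_flows = simulate(k, precip)
        grad = 2.0 / n * sum((q - o) * dq for q, o, dq in zip(flows, observed, d_flows))
        k = min(0.99, max(0.01, k - lr * grad))  # keep k in a physically sensible range
    return k


# Synthetic "gauge" record generated with a known parameter (k = 0.3).
precip = [5.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0]
observed, _ = simulate(0.3, precip)
k_fit = fit_k(precip, observed)
print(f"recovered k = {k_fit:.3f}")
```

Because the model's output has a gradient with respect to its parameter, the same training loop would work if `k` came from a neural network fed with static attributes, which is the general idea behind parameter learning at many gauges at once.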


Thank you to all those who contributed towards this effort.

🔗 Learn More

For more details on the Wukong computing platform, check out the official documentation:
👉 Wukong Documentation

For the full research paper by Song et al. (2024), visit:
👉 DOI: 10.22541/essoar.172736277.74497104/v1